3 research outputs found

    High compression rate text summarization

    Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2008. Includes bibliographical references (p. 95-97). This thesis focuses on methods for condensing large documents into highly concise summaries, achieving compression rates on par with human writers. While the need for such summaries in the current age of information overload is increasing, the desired compression rate has thus far been beyond the reach of automatic summarization systems. The potency of our summarization methods is due to their in-depth modelling of document content in a probabilistic framework. We explore two types of document representation that capture orthogonal aspects of text content. The first represents the semantic properties mentioned in a document in a hierarchical Bayesian model. This method is used to summarize thousands of consumer reviews by identifying the product properties mentioned by multiple reviewers. The second representation captures discourse properties, modelling the connections between different segments of a document. This discriminatively trained model is employed to generate tables of contents for books and lecture transcripts. The summarization methods presented here have been incorporated into large-scale practical systems that help users effectively access information online. By Satchuthananthavale Rasiah Kuhan Branavan. S.M.

    Grounding linguistic analysis in control applications

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2012. Cataloged from PDF version of thesis. Vita. Includes bibliographical references (p. 175-182). This thesis addresses the problem of grounding linguistic analysis in control applications, such as automated maintenance of computers and game playing. We assume access to natural language documents that describe the desired behavior of a control algorithm, either via explicit step-by-step instructions, via high-level strategy advice, or by specifying the dynamics of the control domain. Our goal is to develop techniques for automatically interpreting such documents and leveraging the textual information to effectively guide control actions. We show that in this setting, language analysis can be learnt effectively via feedback signals inherent to the control application, obviating the need for manual annotations. Moreover, we demonstrate how information automatically acquired from text can be used to improve the performance of the target control application. We apply our ideas to three applications of increasing linguistic and control complexity: interpreting step-by-step instructions into commands in a graphical user interface; interpreting high-level strategic advice to play a complex strategy game; and leveraging text descriptions of world dynamics to guide high-level planning. In all cases, our methods produce text analyses that agree with human notions of correctness, while yielding significant improvements over strong text-unaware methods in the target control application. By Satchuthananthavale Rasiah Kuhan Branavan. Ph.D.